imprinted genes 
ABSTRACT 

Background Genomic imprinting is allelic restriction of 
gene expression potential depending on parent of origin, 
maintained by epigenetic mechanisms including parent 
of origin-specific DNA methylation. Among 
approximately 70 known imprinted genes are some 
causing disorders affecting growth, metabolism and 
cancer predisposition. Some imprinting disorder patients 
have hypomethylation of several imprinted loci (HIL) 
throughout the genome and may have atypically severe 
clinical features. Here we used array analysis in HIL 
patients to define patterns of aberrant methylation 
throughout the genome. 

Design We developed a novel informatic pipeline 
capable of small sample number analysis, and profiled 
10 HIL patients with two clinical presentations 
(Beckwith-Wiedemann syndrome and neonatal diabetes) 
using the Illumina Infinium Human Methylation450 
BeadChip array to identify candidate imprinted regions. 
We used robust statistical criteria to quantify DNA 
methylation. 

Results We detected hypomethylation at known 
imprinted loci, and 25 further candidate imprinted 
regions (nine shared between patient groups) including 
one in the Down syndrome critical region (WRB) and 
another previously associated with bipolar disorder 
(PPIEL). Targeted analysis of three candidate regions 
(NHP2L1, WRB and PPIEL) showed allelic expression, 
methylation patterns consistent with allelic maternal 
methylation and frequent hypomethylation among an 
additional cohort of HIL patients, including six with 
Silver-Russell syndrome presentations and one with 
pseudohypoparathyroidism 1B. 
Conclusions This study identified novel candidate 
imprinted genes, revealed remarkable epigenetic 
convergence among clinically divergent patients, and 
highlights the potential of epigenomic profiling to 
expand our understanding of the normal methylome and 
its disruption in human disease. 



INTRODUCTION 

Genomic imprinting is the epigenetic regulation of gene
expression by parent of origin. DNA methylation at imprinting
control regions (ICRs) is the most robust and widely studied
epigenetic modification regulating imprinting. Genomic imprinting
requires resetting of DNA methylation in the germline and its
subsequent resistance to erasure during the transition from germ
cell to early embryonic development. While methylation at ICRs is
ubiquitous and permanent, the effects on DNA methylation and
expression of surrounding genes are dependent on other factors
such as tissue and developmental stage.

Many imprinted loci were identified through the developmental
disorders caused by their disruption, and particularly the
discovery of uniparental disomy (UPD) and other genetic errors in
rare human disorders of imprinting. But the total number of
imprinted genes is not known. Recent efforts to identify imprinted
genes by murine transcriptome analysis yielded high numbers of
transcripts with allelic bias. However, this observation has been
disputed and may be attributable to various technical sources of
skewed allelic representation in RNA-seq data. More recently,
genome-wide bisulfite sequencing has allowed direct assessment of
allele-specific methylation; taken together, these observations
suggest that our current catalogue of imprinted genes is
approaching completion, with few novel germline imprints remaining
to be discovered (http://igc.otago.ac.nz).

Many known imprinted genes are regulators of growth and
development, and their expression at critical developmental times
is functionally hemizygous. Therefore, alteration of effective copy
number can cause developmental disorders. To date, eight imprinting
disorders (IDs) have been identified: Beckwith-Wiedemann syndrome
(BWS; MIM #130659), Silver-Russell syndrome (SRS; MIM #180860),
transient neonatal diabetes (TND) mellitus (MIM #601410),
Prader-Willi syndrome (MIM #176270), Angelman syndrome
(MIM #105830), matUPD14-like (Temple syndrome) and patUPD14-like
syndromes, and pseudohypoparathyroidism 1B (PHP-1B; MIM #103580).
Aetiological mechanisms of IDs include UPD, copy number variation,
mutation of the expressed copy, or epimutation secondary to or
independent of a predisposing genetic mutation. A subset of
patients with IDs have epimutations affecting multiple imprinted
loci across the genome (multi-locus methylation disorders or
hypomethylation of imprinted loci (HIL)). The reported rate of HIL
in BWS is 38% (with ICR2 hypomethylation), 57% in TND (with PLAGL1
hypomethylation) and 10% in SRS (with ICR1 hypomethylation). There
is no standard quantification for hypomethylation at the affected
loci, though tissue mosaicism is thought to account for the
variation observed between patients. In some of these disorders, a
shared pattern of methylation derangement can be detected, and
underlying genetic mutations have been identified; in other cases,
the cause(s) remain unknown.

In order to identify novel imprinted regions, several groups 
have used genome-wide methylation analyses of patients with 
UPD and HIL, commonly using the Infinium Human 
Methylation27 BeadChip array. The potential limitations of 
this approach include the limited coverage of this array, and the 
lack of suitable bioinformatic pipelines to study large methyla- 
tion changes in small study cohorts, as currently available pipe- 
lines are designed to assess modest DNA methylation changes in 
large study cohorts. To address these limitations, we used 
the Infinium Human Methylation450 BeadChip array, and 
developed a new analysis pipeline capable of robust analysis of 
small study groups with large methylation changes. 

Here, we analysed the methylomes of 10 HIL patients with 
two clinical presentations (five BWS and five neonatal diabetes), 
compared with normal controls, and identified hypomethylated 
regions, including three hitherto undescribed candidate 
imprinted regions. 

MATERIALS AND METHODS 
Study population (ethics) 

Peripheral blood leucocyte DNA of patients with IDs was assessed
by methylation-specific PCR (msPCR) at 11 maternally methylated
loci, as described (see online supplementary table S1; the majority
of these patients have been previously reported in Poole et al).
Those patients with hypomethylation at loci additional to the
primary locus for their presenting disorder were classified as HIL,
and subgrouped using the epigenetic profiles of these 11 maternal
imprinted loci. It was apparent that five patients with TND and
five with BWS showed an overlapping pattern of hypomethylation:
TND-HIL samples showed hypomethylation at PLAGL1, DIRAS3, IGF2R
and IGF1R differentially methylated regions (DMRs), with some
additional overlap of hypomethylation at MEST, KCNQ1OT1 and GRB10,
and BWS-HIL patients shared hypomethylation of KCNQ1OT1, PLAGL1,
IGF1R and MEST, with NESPAS and GNAS hypomethylation observed in
2/5 patients. These patients were selected for further analysis to
determine whether they had additional shared hypomethylation
patterns.

All TND-HIL patients were negative for ZFP57 mutations 
and BWS-HIL patients negative for NLRP2 mutations. The 
ethical approval for the use of these samples was obtained 
through the study 'Imprinting Disorders Finding Out Why?', 
approved by Southampton and South West Hampshire Research 
Ethics committee 07/H0502/85 and 'Mapping clinical and 
molecular studies of 6q24 transient neonatal diabetes' approved 
by Wiltshire Research Ethics committee 08/H0104/15. 

Control population 

Control group 1 (N=221) and control group 2 (N=245) 
anonymous batch-matched healthy samples from an unrelated 
study were used to generate control methylation profiles for the 
analysis of TND-HIL and BWS-HIL cases, respectively. Control 
group 1 samples were mixed gender and source material, with 
198 peripheral blood leucocytes DNA samples derived from 
cohort members and their partners and 23 cord blood leucocyte
DNA samples from their offspring, whereas control group
2 contained 221 peripheral blood leucocyte DNA samples from 
female subjects at 18 years of age from an unselected population 
birth cohort. Ethical approval was obtained from the Isle of 
Wight Local Research Ethics Committee (now named the 
National Research Ethics Service, NRES Committee South 
Central — Southampton B) for the 18 years follow-up (06/ 
Q1701/34) and NRES Committee South Central— Hampshire B 
(09/H0504/129) for the third generation study. 

Validation samples 

Methylation array findings were validated by targeted testing of 
DNA and RNA samples. DNA was derived from two hydatidi- 
form mole cell lines, peripheral blood leucocytes of 92 anon- 
ymised controls, four anonymised normal trios and 34 
anonymised individuals diagnosed with Down syndrome, and 
patients with IDs: five TND-HIL, six BWS-HIL, seven SRS-HIL,
one PHP-HIL, five ZFP57 mutation cases presenting with
TND and nine patients with hypomethylation at only one locus
(two TND with PLAGL1 hypomethylation, two BWS patients
with KCNQ1OT1 hypomethylation, four SRS patients with
ICR1 hypomethylation and one with UPD7mat). These samples
were obtained under the same ethical approval as the study
group and previously reported. Nucleic acids (DNA and 
RNA) from human embryonic and fetal tissues were obtained 
with informed consent and with permission from the 
Southampton and South West Hampshire joint Research Ethics 
Committee, staged according to the Carnegie classification or 
foot length. 

Array-based methylation analysis 

A total of 1250 ng of DNA, quantified with a Qubit 2.0 Fluorometer,
was bisulfite-treated using the EZ 96-DNA methylation kit (Zymo
Research, California, USA), following the manufacturer's standard
protocol. Genome-wide DNA methylation was assessed by The Oxford
Genomics Centre using the Illumina Infinium HumanMethylation450
BeadChip (Illumina, Inc., California, USA). Arrays were processed
using the manufacturer's standard protocol, with multiple identical
control samples assigned to each bisulfite conversion batch to
assess assay variability, and samples randomly distributed on
microarrays to control against batch effects. The BeadChips were
scanned using a BeadStation, and the methylation level (β value)
calculated for each queried CpG locus using the Methylation Module
of BeadStudio software.

Data preprocessing and quality control 

A pipeline was developed using the Illumina methylation ana- 
lysis (IMA) package within the R statistical analysis environment 
(http://www.r-project.org). Data from five TND-HIL and five 
BWS-HIL samples were grouped and run in this pipeline inde- 
pendently. Sites were removed that contain any missing values. 
All samples met minimal inclusion criteria for analysis, as each 
sample had >75% sites with a detection p value <lxlO~^. In 
all, 216 sites were removed from TND-HIL study and 106 from 
BWS-HIL study, as these had detected p value >0.05 in at least 
75% of the sample analysed. Among these removed sites, 68 are 
common between the two study groups. Initial QC-plots (see 
online supplementary figure SI) for both of the studies showed 
that male and female samples clustered together via unsuper- 
vised clustering resulting from gender-specific biases in methyla- 
tion level. Therefore, probes on X and Y chromosomes 
were removed to discard any sex bias within the samples. The 
number of sites annotated by probe types that were removed by 
the initial quality control step is shown in online supplementary 
table S2. A total of 76.88% probes remained for the TND-HIL 
analysis and 81.82% remained for the BWS-HIL analysis after 
the preprocessing. 
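
The site-level filter described above can be sketched in a few lines. This is a minimal illustration rather than the IMA package's own code: the function name `filter_sites` and the list-of-lists input are hypothetical, and the 0.05 and 75% thresholds are taken from the text.

```python
def filter_sites(detection_p, p_cutoff=0.05, max_fail_fraction=0.75):
    """Return the indices of CpG sites that pass quality control.

    detection_p holds one row per CpG site; each row lists the detection
    p value of that site in every sample (None marks a missing value).
    """
    kept = []
    for index, row in enumerate(detection_p):
        if any(p is None for p in row):
            continue  # sites with any missing value are removed
        failing = sum(1 for p in row if p > p_cutoff)
        if failing / len(row) >= max_fail_fraction:
            continue  # unreliable in at least 75% of the samples
        kept.append(index)
    return kept


# Hypothetical detection p values for four sites in four samples.
example = [
    [0.001, 0.002, 0.000, 0.010],  # reliable everywhere
    [0.200, 0.300, 0.600, 0.001],  # fails in 3 of 4 samples
    [0.010, None, 0.010, 0.010],   # contains a missing value
    [0.200, 0.010, 0.010, 0.010],  # fails in only 1 of 4 samples
]
passing_sites = filter_sites(example)
```

Probes on the X and Y chromosomes would be dropped in a separate step, using the array annotation.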

The β-values were converted to M-values by logit transformation,
as M-values are better suited to statistical tests for differential
methylation. Quantile normalisation was used to normalise signal
intensities for each probe and reduce inter-array variation.
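
The conversion between the two methylation scales can be sketched as follows. This is an illustration under stated assumptions, not the pipeline's code: the function names are hypothetical, the natural-log logit is assumed as the default (base 2 is the more common M-value convention and can be passed explicitly), and β is clipped slightly away from 0 and 1 to avoid division by zero.

```python
import math


def beta_to_m(beta, base=math.e, eps=1e-6):
    """Logit-transform a methylation fraction (beta) to an M-value."""
    beta = min(max(beta, eps), 1 - eps)  # keep the ratio finite
    return math.log(beta / (1 - beta), base)


def m_to_beta(m, base=math.e):
    """Invert the transform: M-value back to a methylation fraction."""
    return base ** m / (1 + base ** m)


half_methylated = beta_to_m(0.5)   # an M-value of 0
upper_natural = m_to_beta(1.0)     # about 0.73 with the natural-log logit
lower_natural = m_to_beta(-1.0)    # about 0.27 with the natural-log logit
upper_base2 = m_to_beta(1.0, 2)    # 2/3 with the base-2 convention
```

Under the natural-log assumption, M-values of -1 and +1 correspond to β of roughly 0.27 and 0.73, that is, loci that are about half methylated in controls, as expected for a germline imprint.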

Illumina Human 450 K methylation array uses two different 
chemistries, Infinium I and II, to enhance the breadth of cover- 
age. Infinium I uses two probes per CpG locus (both methylated 
and unmethylated query probes), whereas in Infinium II only 
one probe (either methylated or unmethylated) per CpG locus 
is required. To correct these differences in the results between 
these two chemistries, peak correction was applied. No batch 
correction was required as all the cases and controls for individ- 
ual experiments had been processed in the same batch. 

Low sample number differential methylation analysis 

Stringent criteria were set to select candidate imprinted
sequences hypomethylated in patients, with p values adjusted
using false discovery rate to ensure statistical robustness.
Individual CpGs were selected when hypomethylated in patients
compared with controls, with an adjusted p value of <1.33×10⁻⁷,
and an M-value between +1 and -1 (equivalent to 0.26<β<0.7) in
normal controls. Genes containing at least two CpGs meeting these
criteria and lying within <2000 nucleotides of each other were
deemed to be candidate DMRs.
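
The selection rules above can be sketched as a small grouping routine. This is a hedged illustration only: the function name `candidate_dmrs`, the dictionary keys and the example values are hypothetical, and clustering adjacent CpGs by their gap is one reasonable reading of "within <2000 nucleotides".

```python
from collections import defaultdict

P_CUTOFF = 1.33e-7   # adjusted p value threshold from the text
MAX_GAP = 2000       # maximum spacing between selected CpGs (nucleotides)


def candidate_dmrs(probes):
    """Group selected CpGs into candidate DMRs.

    Each probe is a dict with keys chrom, gene, pos, adj_p, control_m and
    patient_m (mean M-values in controls and patients).
    Returns tuples of (chrom, gene, start, end, number_of_probes).
    """
    selected = defaultdict(list)
    for p in probes:
        hypomethylated = p["patient_m"] < p["control_m"]
        if p["adj_p"] < P_CUTOFF and -1 <= p["control_m"] <= 1 and hypomethylated:
            selected[(p["chrom"], p["gene"])].append(p["pos"])

    regions = []
    for (chrom, gene), positions in sorted(selected.items()):
        positions.sort()
        cluster = [positions[0]]
        for pos in positions[1:]:
            if pos - cluster[-1] < MAX_GAP:
                cluster.append(pos)
                continue
            if len(cluster) >= 2:  # a locus needs at least two CpGs
                regions.append((chrom, gene, cluster[0], cluster[-1], len(cluster)))
            cluster = [pos]
        if len(cluster) >= 2:
            regions.append((chrom, gene, cluster[0], cluster[-1], len(cluster)))
    return regions


# Hypothetical probes: two qualifying CpGs in one gene, a lone CpG in
# another, and a CpG that misses the p value threshold.
example_probes = [
    {"chrom": "21", "gene": "WRB", "pos": 40757691, "adj_p": 1e-20,
     "control_m": 0.1, "patient_m": -2.0},
    {"chrom": "21", "gene": "WRB", "pos": 40758208, "adj_p": 1e-18,
     "control_m": -0.2, "patient_m": -2.5},
    {"chrom": "5", "gene": "GENE_A", "pos": 1000, "adj_p": 1e-30,
     "control_m": 0.0, "patient_m": -3.0},
    {"chrom": "7", "gene": "GENE_B", "pos": 5000, "adj_p": 1e-3,
     "control_m": 0.0, "patient_m": -3.0},
]
example_regions = candidate_dmrs(example_probes)
```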

Initially paired t test and one-sample t test were used for stat- 
istical analysis; however, these methods did not reveal any 
probes meeting our stringent criteria, probably because of the 
low sample number. Therefore, we explored the linear model 
technique, used for analysis of microarray data, which models 
the significant part of the data and then allows the fitted coeffi- 
cients to be compared in as many ways as possible. Crawford 
and Garthwaite proved that using a larger control group can 
produce significant statistical results even for a single case,
provided that appropriate statistical methods are applied.
Therefore, for both of the case groups, we used larger numbers 
of controls (n>200) against smaller numbers of cases (n=5). 
The linear model achieved convincing statistical outcomes from 
our pipeline, with efficient identification of known and novel 
hypomethylated loci for both TND-HIL and BWS-HIL case 
groups. Using the same criteria, only one region of hypermethy- 
lation was found in TND-HIL and four in BWS-HIL; these 
were not further examined as they were not relevant to this 
study (data not shown). 
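
To make the comparison of a few cases with many controls concrete, the sketch below computes, for a single probe, the t statistic of the group coefficient in a two-group linear model (equivalent to a pooled-variance t test), followed by a false discovery rate adjustment. Several points are assumptions: the function names are hypothetical, the Benjamini-Hochberg procedure is assumed as the FDR method, the example values are invented, and the variance moderation that microarray linear-model packages usually add is not reproduced.

```python
import math
import statistics


def two_group_t(cases, controls):
    """t statistic and degrees of freedom for the case-control difference."""
    n1, n2 = len(cases), len(controls)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * statistics.variance(cases)
                  + (n2 - 1) * statistics.variance(controls)) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.fmean(cases) - statistics.fmean(controls)) / se
    return t, df


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical M-values for one probe: 5 cases against 210 controls.
case_m = [-3.0, -2.5, -3.5, -2.8, -3.2]
control_m = [-0.5, 0.0, 0.5] * 70
t_value, dof = two_group_t(case_m, control_m)
adjusted_p = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

With a large control group the variance estimate is dominated by the controls, which is why a strongly deviating group of only five cases can still reach a very small p value.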

Targeted validation testing 

msPCR analysis of the 11 maternally methylated loci used previously
described primers and protocols. msPCR primers for 
candidate loci NHP2L1, PPIEL and WRB are listed in online 
supplementary table S3. 

Bisulfite sequencing 

Bisulfite-specific primers were designed to amplify regions of
80-180 nt containing 7-12 CpG dinucleotides, using PyroMark
software V1.0 (Qiagen). Primer sequences are listed in online
supplementary table S3. Amplicons were generated (Phusion
DNA polymerase, New England BioLabs) from two patients and
two controls, and ligated into pCR2.1 (Invitrogen); 2 µL of each
ligation was transformed into chemically competent TOP10
cells (Invitrogen). Positive clones were selected on agar plates
supplemented with 40 µg/mL X-gal and 100 µg/mL ampicillin.
Overall, 24 white colonies were selected from each plate and
suspended in 50 µL dH2O prior to denaturation (94°C for
5 min). An amount of 1 µL of the denatured bacterial solution
was used as a PCR template for M13 primer amplification
(Phusion DNA polymerase, New England BioLabs). These reactions
were treated with ExoSAP to degrade remaining primers,
prior to sequencing with M13 forward and reverse primers.
Very similar results were obtained for the two controls and the
two patients; results from only one patient and one control are
presented in the figures.

Restriction digest sequencing 

To determine whether methylation was allele-specific or restricted 
by parent of origin, SNPs were analysed in proximity to DMRs in 
DNA from family trios. Heterozygous SNPs were identified and 
their inheritance determined by Sanger sequencing in DNA of off- 
spring and parents. To determine methylation status, 200 ng of 
offspring DNA was digested before amplification with restriction 
enzymes BstUI or McrBC (New England Biolabs) according to
manufacturer's instructions, as described.

Expression analysis 

Coding SNPs were identified within novel imprinting gene candidates
WRB and NHP2L1 (rs13230 and rs8779, respectively). 
These were used to identify heterozygous samples collected fol- 
lowing termination of pregnancy for a non-medical/social 
reason at gestational age 8-12 weeks with RNA-matched 
samples for a range of tissues (primers listed in online supple- 
mentary table S3). Allele-specific expression was then assessed 
in available heterozygous embryonic tissues. 

cDNA was prepared with Superscript III reverse transcriptase 
(Invitrogen) from 500 ng embryonic RNA. RT-PCR primers 
were designed to detect different isoforms of the candidate 
genes (see online supplementary table S3) and were amplified 
using Phusion DNA polymerase (New England BioLabs). 

RESULTS 

Statistical analysis of 450 K methylation array data 

We developed a new analysis pipeline to detect methylation
changes, with stringent selection criteria, capable of robust analysis
of our small epigenetically defined groups (see Materials and
methods section). The pipeline employed the linear modelling
commonly used for microarray analysis and compared small patient
numbers against a large control group to produce significant
statistical results. Using stringent selection criteria, 34
hypomethylated regions were identified in the BWS-HIL cohort and
21 regions in TND-HIL (figure 1, see online supplementary tables
S4 and S5).

The hypomethylated regions generated from both groups 
included several known imprinted genes (table 1, see online sup- 
plementary tables S4 and S5), both within and outside the 11 
loci previously assessed in targeted analysis. The p values 
observed for known loci were proportionate to the degree of 
hypomethylation predicted from msPCR analysis of the patient 
groups. This is most clearly demonstrated at the disease-specific 
loci, where the lowest adjusted p value for the TND locus 
PLAGL1 was more significant in TND-HIL than BWS-HIL 
(4.84x10"^^"^ vs 4.39x10"^^) (see online supplementary figure 
S2B, supplementary tables S4 and S5) and, conversely, the BWS 
locus KCNQ1OT1 had a lower p value in BWS-HIL than 
TND-HIL cohort (4.27x10"^^ vs 9.47x10"^^) (see online 
supplementary tables S4 and S5). These p values were consistent 
with the degree of hypomethylation detected by targeted testing 
(see online supplementary table S1). 

To assess the effect of merging patient data on the ability of 
the pipeline to detect hypomethylation, we used SNRPN, the 

Figure 1 Distribution of known and candidate differentially methylated CpG sites in (A) Beckwith-Wiedemann syndrome (BWS) and (B) transient 
neonatal diabetes (TND). In each case, the pie chart to the left shows CpG sites compared between cases and controls (in grey), including those 
meeting criteria for differential methylation; the pie chart to the right highlights hypomethylated CpG sites, including those in known 
clinically-relevant loci (red), loci reported to be imprinted (pink) and loci not currently reported to be imprinted, that is, candidate loci (blue). 
(C) Chromosome ideogram showing the distribution across all autosomes of known and candidate differentially methylated loci. Black dots represent 
known imprinted genes that were shown to be hypomethylated in the TND patient group in this study; the green dots represent known imprinted 
genes shown to be hypomethylated in the BWS patient group in this study. Red and blue squares correspond to candidate imprinted loci in TND-HIL 
and BWS-HIL, respectively. The names of imprinted loci associated with imprinting disorders are displayed next to loci, in black, where they were 
detected as hypomethylated in patient samples. 



only locus identified by msPCR in both patient groups with 
hypomethylation of a single patient (see online supplementary
table S1). Using our criteria, hypomethylation of SNRPN was
resolved in the TND-HIL, but not the BWS-HIL cohort where
the hypomethylation was less severe (table 1, see online
supplementary table S6). Thus, the pipeline was shown to resolve 
moderate hypomethylation in a single individual, validating the 
analysis of these hyper-rare patients as a group, rather than 
attempting analysis of single patients, which presents significant 
statistical challenges. 

In addition to the known imprinted regions, 23 and 11 novel
candidate DMRs were detected in the BWS-HIL and TND-HIL
cohorts, respectively. Nine of these candidate DMRs were shared
between BWS-HIL and TND-HIL patient groups (table 1). It is
noteworthy that the coverage of probes was broadly higher in
known imprinted genes than novel candidates (eg, 54 in PLAGL1,
267 in KCNQ1 and 73 in MEST, compared with 24 in ERLIN2,
28 in WRB, 23 in NHP2L1 and 13 in LOC728448), reducing the
likelihood of finding such novel candidates by chance. 

Validation of differential methylation region candidates 

Candidates were prioritised for follow-up based on prior evidence
of allele-specific methylation in primary cell lines and
hypomethylation in sperm (from Fang et al) which would be
consistent with maternal imprinting (this eliminated JAKMIP1
and GLP2R). Further inspection highlighted three candidates
(NHP2L1, WRB and PPIEL) where hypomethylation affected
sequence contexts characteristic of imprinted genes (figures 2 
Table 1 Hypomethylated regions shared between TND-HIL and BWS-HIL patients 

                                          BWS                                                    TND
Chr  Gene name                CpG island  Probe region*            No. probes†  Lowest p value‡  Probe region*            No. probes†  Lowest p value‡

Novel candidate DMRs
1    LOC728448/ma             No          40 024 971-40 025 411    3            1.47E-18         40 024 971-40 025 232    2            3.09E-22
4    JAKMIP1                  Yes         6 107 021-6 107 339      4            2.48E-16         6 107 021-6 107 339      4            5.83E-36
7    SVOPL                    Yes         138 348 774-138 349 443  3            6.30E-41         138 348 774-138 349 443  3            7.21E-20
9    FANCC                    Yes         98 075 481-98 075 492    2            8.29E-58         98 075 481-98 075 492    2            7.28E-55
17   GLP2R                    No          9 729 250-9 729 424      3            3.33E-16         9 729 250-9 729 422      4            1.81E-23
21   WRB                      Yes         40 757 691-40 758 208    2            2.51E-20         40 757 691-40 758 208    4            6.71E-29
8    LOC728024/ERLIN2         No          37 605 517-37 605 783    4            3.87E-40         37 605 359-37 605 978    6            2.69E-42
18   LOC100130522/PARD6G-AS1  Yes         77 905 355-77 905 947    3            1.01E-19         77 905 298-77 905 947    9            4.38E-71
22   NHP2L1                   Yes         42 078 217-42 078 723    6            4.08E-15         42 078 217-42 078 723    6            4.25E-54

Imprinted, not associated with ID
1    DIRAS3                   Yes         68 512 539-68 517 273    21           6.69E-31         68 512 539-68 517 273    20           5.45E-64
6                             Yes         3 849 235-3 849 818      17           1.70E-18         3 849 272-3 849 818      17           1.64E-39
15   IGF1R                    No          99 408 636-99 409 506    5            2.23E-15         99 408 636-99 409 957    6            1.04E-36
19   ZNF331                   Yes         54 040 774-54 058 085    11           1.39E-40         54 040 813-54 058 085    10           9.13E-53
20   L3MBTL                   Yes         42 142 417-42 143 502    13           1.32E-17         42 142 417-42 143 489    18           7.60E-25

Imprinted, associated with ID
6    PLAGL1                   Yes         144 328 421-144 329 909  14           1.06E-55         144 328 482-144 329 909  15           1.22E-129
7    MEST                     Yes         130 130 187-130 133 110  42           6.12E-42         130 130 383-130 133 110  42           1.73E-45
11   KCNQ1                    Yes         2 715 837-2 722 258      26           1.14E-73         2 720 463-2 722 119      9            4.86E-13

Datasets from five patients with BWS-HIL and five with TND-HIL were compared with datasets from 245 and 211 batch-matched normal controls, respectively. Probes with M-values between -1 and +1 in controls and relative hypomethylation in patients 
with a p value of <1.33E-7 were identified. This subset was further filtered by minimal criteria for a hypomethylated locus, that is, ≥2 hypomethylated probes spaced by <2000 nucleotides. Candidate regions that meet these criteria in both BWS-HIL and 
TND-HIL are listed in this table. 

*Genome position of most proximal and distal probe fulfilling hypomethylation criteria. 
†Number of probes within the locus fulfilling hypomethylation criteria. 
‡Minimum p value among probes fulfilling hypomethylation criteria. 

BWS, Beckwith-Wiedemann syndrome; DMR, differentially methylated region; HIL, hypomethylation of imprinted loci; ID, imprinting disorder; TND, transient neonatal diabetes. 
Figure 2 DNA methylation and expression analysis of NHP2L1 in patients with Beckwith-Wiedemann syndrome (BWS) and transient neonatal 
diabetes (TND). (A) Screengrab from UCSC genome browser representing the NHP2L1 gene and imprinted locus. The subregion highlighted in (B) is 
marked by a red double-ended arrow. Small numbers under the screengrab denote the exon numbering as used for expression analysis in (E); red 
asterisk indicates the position of the SNP analysed in (E). Note that NHP2L1 is transcribed from right to left with respect to genomic orientation. 
(B) Divergent DNA methylation between normal controls and patients, detected by methylation array. Solid lines denote M-values (left axis). Dashed 
lines represent p values of methylation difference between patients and controls (right axis). Black line represents normal controls; blue lines 
represent averaged methylation of five BWS patients; red lines represent averaged methylation of five TND patients. (C) Illustrative electropherogram 
from methylation-specific PCR experiment showing difference in DNA methylation between a single patient and control. Amplicons derived from 
methylated and unmethylated DNA are marked by red and blue lines, respectively. (D) Summary of bisulfite cloning and sequencing experiment 
comparing a patient with a normal control. The circles represent CpG dinucleotides within a sequence amplified after bisulfite modification, with 
filled and empty circles representing methylated and unmethylated DNA sequences, respectively. The number to the right indicates the number of 
times the sequence was detected in individual clones. In no case were methylated and unmethylated CpG dinucleotides detected within a single 
clone. (E) Allele-specific expression analysis of NHP2L1. Top electropherogram represents genomic sequencing across rs8779 showing heterozygous 
SNP. Lower electropherograms represent sequencing of RT-PCR products from pancreatic cDNA, amplified from exons 1-4 (biallelic expression) and 
2-4 (monoallelic). 
Figure 3 DNA methylation and expression analysis of WRB in patients with Beckwith-Wiedemann syndrome (BWS) and transient neonatal 
diabetes (TND). (A) Screengrab from UCSC genome browser, representing the WRB gene and imprinted locus. The subregion highlighted in (B) is 
marked by a red double-ended arrow. Small numbers under the screengrab denote the exon numbering as used for expression analysis in (E); red 
asterisk indicates the position of the SNP analysed in (E). (B) Divergent DNA methylation between normal controls and patients, detected by 
methylation array. Solid lines denote M-values (left axis). Dashed lines represent p values of methylation difference between patients and controls 
(right axis). Black line represents normal controls; blue lines represent averaged methylation of five BWS patients; red lines represent averaged 
methylation of five TND patients. (C) Illustrative electropherogram from methylation-specific PCR experiment, showing difference in DNA methylation 
between a single patient and control. Amplicons derived from methylated and unmethylated DNA are marked by red and blue lines, respectively. (D) 
Summary of bisulfite cloning and sequencing experiment comparing a patient with a normal control. The circles represent CpG dinucleotides within 
a sequence amplified after bisulfite modification, with filled and empty circles representing methylated and unmethylated DNA sequences, 
respectively. The number to the right indicates the number of times that sequence was detected in individual clones. In no case were methylated 
and unmethylated CpG dinucleotides detected within a single clone. (E) Allele-specific expression analysis of WRB. Top electropherogram represents 
genomic sequencing across rs1060180 showing heterozygous SNP. Lower electropherograms represent sequencing of RT-PCR amplicons in human 
fetal tissues as stated. 
and 3, see online supplementary figure S3A). msPCR on a panel
of 96 anonymised normal control samples showed methylation
levels at all three loci to be stable in the normal population (SD
NHP2L1=0.15, WRB=0.23 and PPIEL=0.22: data not shown).
Analysis of complete hydatidiform mole (no methylation at
maternally imprinted loci) showed complete hypomethylation in
all three loci (data not shown). 

DNA methylation at the candidate loci was then confirmed by
msPCR in four of the five test HIL patients in each cohort
(figures 2C and 3C; see online supplementary figure S3C;
online supplementary table S1). For the two other patients,
insufficient DNA remained for further analysis. All showed
hypomethylation of at least one candidate locus: 2/4 TND-HIL
patients were hypomethylated at all 3 loci, while 3/4 BWS-HIL
and 1/4 TND-HIL patients showed hypomethylation at 2-3
loci. We then explored the methylation of these loci in DNA
from further ID patients, including those with and without HIL,
and those with hypomethylation of maternal and paternal DNA.
Four of five additional TND-HIL patients and five of six
additional BWS-HIL patients had hypomethylation at one or more
loci, thus validating these as regions frequently affected by
hypomethylation in TND-HIL and BWS-HIL patients (see
online supplementary table S1). Less expected was the observation
that NHP2L1, WRB and PPIEL candidate DMRs also
showed hypomethylation in SRS-HIL patients (6/7, 4/7 and 1/7,
respectively) and WRB hypomethylation in 1/1 PHP-HIL
patient. No hypomethylation was observed at any of the loci in
five patients with ZFP57 mutations nor in nine patients with an
ID affecting only one locus. This suggested that hypomethylation
at these loci was restricted to HIL patients, rather than
being widespread among ID patients. 

Additionally, WRB methylation was analysed in 34 anon- 
ymised DNA samples from individuals diagnosed with Down 
syndrome. In all, 31 samples showed partial hypermethylation 
in a ratio consistent with the presence of one additional methy- 
lated allele of WRB; two showed partial hypomethylation con- 
sistent with one additional unmethylated allele of WRB; and 
one showed methylation equivalent to normal controls (see 
online supplementary figure S4). We were unable to confirm the 
parental origin of the additional chromosome 21 for these 
patients. However, given that 95% of trisomy 21 is of maternal 
origin, we infer that this ratio of apparent hypermethylation 
and hypomethylation, at 31:2 Down syndrome patients 
(94%: 6%), is consistent with DNA methylation being present 
on the maternal allele of WRB. 
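
The arithmetic behind this inference can be written out as a short worked example. It is a sketch under a stated assumption: a disomic control carries one methylated (maternal) and one unmethylated (paternal) WRB allele, so the extra chromosome 21 shifts the methylated fraction to 2/3 or 1/3. The function name is hypothetical; the counts 31 and 2 are those reported above.

```python
from fractions import Fraction


def expected_methylated_fraction(extra_allele_methylated):
    """Methylated fraction of WRB alleles in trisomy 21.

    Assumes the disomic baseline is one methylated and one unmethylated
    allele, to which the third allele is added.
    """
    methylated_alleles = 1 + (1 if extra_allele_methylated else 0)
    return Fraction(methylated_alleles, 3)


extra_maternal = expected_methylated_fraction(True)    # 2/3, partial hypermethylation
extra_paternal = expected_methylated_fraction(False)   # 1/3, partial hypomethylation

hypermethylated, hypomethylated = 31, 2
observed_share = Fraction(hypermethylated, hypermethylated + hypomethylated)
observed_percent = round(float(observed_share) * 100)  # 94
```

The observed share of about 94% is close to the 95% of trisomy 21 cases reported to be of maternal origin, which is the basis for placing the methylation on the maternal allele.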

Parent of origin-specific methylation was investigated at 
NHP2L1 and PPIEL candidate DMRs using methylation-specific 
restriction digest and sequencing. These results were consistent 
with maternal inheritance of the methylated allele at both candi- 
date DMRs (see online supplementary figures S5 and S6). To 
further demonstrate that DNA methylation was discrete, that is, 
concentrated on one allele rather than homogeneously distribu- 
ted, we performed bisulfite cloning and sequencing of NHP2L1, 
WRB and PPIEL DMRs. Amplicons from each candidate region 
were cloned and sequenced in two controls and two patients 
identified by msPCR as having hypomethylation. This confirmed 
the presence of fully-methylated and fully-unmethylated ampli- 
cons in controls, and relative hypomethylation in patient 
samples for all three candidate regions (figures 2D and 3D; see 
online supplementary figure S2D). 
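
The distinction between discrete (allelic) and homogeneously distributed methylation can be illustrated with a small sketch. It is not the authors' procedure: the clone strings are toy data, the 0.8 and 0.2 thresholds are arbitrary assumptions, and the function names are hypothetical. Each clone is written as a string of CpG calls, `1` for methylated and `0` for unmethylated.

```python
def clone_fraction(clone: str) -> float:
    """Fraction of CpG sites called methylated ('1') in one sequenced clone."""
    return clone.count("1") / len(clone)


def summarise(clones, hi=0.8, lo=0.2):
    """Count clones that are mostly methylated, mostly unmethylated, or mixed."""
    counts = {"methylated": 0, "unmethylated": 0, "mixed": 0}
    for clone in clones:
        fraction = clone_fraction(clone)
        if fraction >= hi:
            counts["methylated"] += 1
        elif fraction <= lo:
            counts["unmethylated"] += 1
        else:
            counts["mixed"] += 1
    return counts


# Toy data (not from the study): both sets are 50% methylated overall.
allelic = ["11111", "11111", "00000", "00000"]
homogeneous = ["10101", "01010", "11000", "00111"]

print(summarise(allelic))
print(summarise(homogeneous))
```

Both toy sets have the same average methylation, but only the first separates into fully methylated and fully unmethylated clones, which is the pattern expected when methylation is concentrated on one allele.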

Validation of allele-specific expression 

To determine whether the hypomethylation observed at the
three candidate DMRs correlated with allele-specific expression
of the associated genes, we analysed expression of transcripts in
human foetal nucleic acids. We identified informative SNPs in 
NHP2L1 and WRB in the genomic DNA of 8-12 week embryos 
(we could not identify informative coding SNPs in PPIEL). 
Matched RNA from multiple tissues was reverse-transcribed and 
amplified by RT-PCR using isoform-specific primers. 

For NHP2L1, monoallelic expression was observed for exon 
2-4 specific transcripts and biallelic expression for exon 1-4 
specific transcripts (figure 2E) in all tested tissues for four 
embryos (data not shown). Biallelic expression of WRB was 
observed in the majority of tissues tested with both exon 1-6 
and 2-6 specific transcripts. However, sporadic monoallelic 
expression was observed with opposing allelic expression in the 
skeletal muscle and aorta of a single embryo (exon 1-6 specific 
primers: figure 3E), and monoallelic expression in 1/3 adrenal 
tissues assayed (exon 2-6 specific primers; data not shown). 

DISCUSSION 

The data presented here demonstrate the successful use of 
whole genome methylation array technology to explore the 
methylome in two rare epigenetically defined cohorts of patients 
with IDs characterised by HIL. 

Our small cohort size necessitated the development of a new 
pipeline capable of robust analysis of small group sizes. While 
other statistical analyses could not detect hypomethylated loci
with statistical significance, the linear model we applied in the pipeline,
with the stringent criteria, detected differential methylation 
robustly. These loci were validated by the evidence from the 
prior partial epigenetic profiling of our patient groups and low 
p values. Moreover, these p values were proportionate to the 
degree of hypomethylation predicted from the known patient 
epimutations. This allowed us to use the pipeline confidently to 
predict novel imprinted regions. 

Consistent with the aim of this study, novel candidate DMRs 
were identified that share several attributes of imprinted genes. 
From the nine candidate DMRs identified, follow-up of three
candidates validated hypomethylation in the patients analysed
by 450 K methylation array. These loci showed hypomethylation
in additional TND-HIL and BWS-HIL patients, but
not in patients with hypomethylation restricted to one primary 
locus or in normal controls. Hypomethylation of all loci in indi- 
viduals with SRS-HIL and WRB in a PHP-HIL patient expanded 
the range of patients observed to have hypomethylation at these 
regions. Additionally, allele-specific methylation and parent- 
specific methylation analysis was consistent with monoallelic 
methylation of maternal origin for all three candidate DMRs, 
with NHP2L1 and WRB showing evidence of allele-specific 
expression. 

It is noteworthy that patterns of hypomethylation were 
shared between HIL patients with divergent clinical presenta- 
tions. This is a surprising observation, but consistent with a 
shared cause of their syndromic presentation. It has become 
apparent in recent years that IDs with common phenotypes are 
associated with multiple imprinted genes (eg, H19 and
KCNQ1OT1 in BWS, and H19 and chr7 in SRS). It is
also apparent that some patients with HIL have clinical features
inconsistent with their epigenotype. There may be
several reasons for this phenotype-epigenotype divergence, but
the most likely is somatic mosaicism, which is common among 
IDs and strongly modifies clinical presentation. It is therefore 
possible that common underlying causes, including environmen- 
tal insults, primary epimutations and trans-acting mutations, 
may cause HIL disorders with highly variable phenotypic fea- 
tures. Comprehensive epigenetic profiling may be required to 
stratify HIL patients with common epimutation patterns and
seek subtle clinical overlaps. Such stratification may support 
exome analysis for common genetic causes, and moreover iden- 
tify further epimutations that may account for some of their 
additional clinical features. It may also be informative to 
compare epigenotype patterns among patients of different 
genetic aetiologies. In this regard, it is interesting that an
epigenetic analysis of a patient whose mother had an NLRP7 mutation
showed very limited overlap of affected imprinted genes
(FAM50B alone) with our patients, but some shared hypomethylation
of non-imprinted genes which may inform differences in
clinical presentation.

Of the three candidate imprinted loci described here, none 
has a well-defined role in either normal physiology or a disease
process. NHP2L1 is a nuclear protein which plays a role in
pre-mRNA splicing as a component of the U4/U6-U5
tri-snRNP and shows evidence of allele-specific methylation.
Little is known about the function of PPIEL (pseudogene of
peptidylprolyl isomerase E) but aberrant DNA methylation at
PPIEL has previously been associated with bipolar disorder, with
a reported strong inverse correlation between gene expression
and DNA methylation levels of PPIEL. WRB encodes a basic
nuclear protein of unknown function and maps to the region
associated with congenital heart disease in Down syndrome.
The clinical relevance of these loci, if any, is unknown. It is
possible that these genes, or any of the others identified as
hypomethylated in our study, could be associated with additional
clinical disorders beyond the eight IDs currently known in clinical
genetics. Cardiac disorders have been reported in 9% of a TND
cohort, and it is possible that analysis of further patients will
reveal whether the involvement of this locus is of clinical
significance. 

There were several potential limitations to our study. First,
whole genome methylation analysis by array is restricted to the
sequences captured on the array: many more candidate
imprinted regions might have been obtained from
whole genome bisulfite sequencing; second, additional HIL
cohorts with other IDs may have provided further candidates;
third, the grouping of disease cases was necessary for statistical
purposes, but may have masked the hypomethylation of less
strongly-affected loci. For the candidate regions that have been
identified there are further limitations to expression analysis in
the form of low frequency SNPs and potentially imprinted
transcript identification. DNA methylation is only one component
of the cellular machinery of imprinting, and the methylation
signature does not necessarily colocate with the gene(s) under its
control or, as observed in the case of the candidate
region PPIEL, even reside within a CpG island.

Further work is required to exploit the findings of this study.
The candidate imprinted loci identified here must be characterised
to determine whether their epimutation has any bearing
on clinical features in the context of HIL or in as-yet undescribed
ID. These or similar patients may be more comprehensively
analysed by whole genome bisulfite sequencing to increase
capture of candidate genes. Greater resolution may also be
obtained if a bioinformatic pipeline can be developed for
statistically robust analysis of individuals, rather than groups of
patients; indeed, such analysis might be the basis for a
comprehensive clinical genetic diagnosis of HIL. Analysis of further
patients may support accurate stratification of patient groups
with common epigenetic signatures, with or without common
phenotype. This in turn would support the search for candidate
trans-acting gene mutations by exome analysis. Identification of 
common DNA motifs in hypomethylated loci may also indicate
association with common trans-acting factors (by analogy with
ZFP57), and such motifs would be the focus for cis-acting
mutations in IDs. Overall, the potential benefits are disproportionate
to the rarity of the patients being analysed, and may include
novel insight into the basic mechanisms of human epigenetics,
as well as novel loci that may be implicated in many other
disorders including Down syndrome and bipolar disorder.